On Large-Scale Retrieval: Binary or n-ary Coding?
Abstract
The growing amount of data in modern datasets creates a need for efficient search and retrieval of information. The two main approaches to making large-scale search feasible are Distance Estimation and Subset Indexing. Binary coding has been the popular choice for implementing both techniques, but n-ary coding (known as Product Quantization) is also very effective for Distance Estimation. The relative performance of the two codings has not, however, been studied for Subset Indexing. We investigate whether binary or n-ary coding works better under different retrieval strategies. This leads to the design of a new n-ary coding method, "Linear Subspace Quantization (LSQ)", which, unlike other n-ary encoders, can be used as a similarity-preserving embedding. Experiments on image retrieval show that n-ary LSQ outperforms other methods when Distance Estimation is used. Interestingly, when Subset Indexing is applied, binary codings are more effective, and binary LSQ achieves the best accuracy.
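To make the terms in the abstract concrete, the following is a minimal sketch of the two coding families and the two retrieval strategies. It is not the paper's LSQ method, whose details are not given here. It assumes random-hyperplane hashing for the binary code and a product-quantization-style n-ary code whose codewords are simply sampled from the data (a real system would typically learn them, for example with k-means). All function names and parameter values are hypothetical and chosen so that both codes use 8 bits per item.

```python
import random
from collections import defaultdict

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def split(x, m):
    """Split x into m equal-length sub-vectors (len(x) must be divisible by m)."""
    step = len(x) // m
    return [x[i * step:(i + 1) * step] for i in range(m)]

def binary_encode(x, hyperplanes):
    """Binary code: one bit per hyperplane (1 if x is on its non-negative side)."""
    return [1 if sum(w * v for w, v in zip(h, x)) >= 0 else 0 for h in hyperplanes]

def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(p != q for p, q in zip(a, b))

def nary_encode(x, codebooks):
    """n-ary code: per subspace, the index of the nearest codeword."""
    subs = split(x, len(codebooks))
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(subs, codebooks)]

def nary_distance(query, code, codebooks):
    """Distance estimate: exact query sub-vectors vs. the item's codewords."""
    subs = split(query, len(codebooks))
    return sum(sq_dist(sub, cb[s]) for sub, cb, s in zip(subs, codebooks, code))

# Toy data (assumed values): 50 points in 8 dimensions.
rng = random.Random(0)
dim, m, k, n_bits = 8, 4, 4, 8   # m subspaces x log2(k)=2 bits = 8 bits, same as n_bits
data = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
# Simplification: codewords are sampled data sub-vectors, not learned centroids.
codebooks = [[split(rng.choice(data), m)[i] for _ in range(k)] for i in range(m)]

binary_codes = [binary_encode(x, hyperplanes) for x in data]
nary_codes = [nary_encode(x, codebooks) for x in data]
query = data[0]

# Distance Estimation: rank every item by a cheap distance computed from its code.
by_hamming = sorted(range(len(data)),
                    key=lambda i: hamming(binary_encode(query, hyperplanes), binary_codes[i]))
by_nary = sorted(range(len(data)),
                 key=lambda i: nary_distance(query, nary_codes[i], codebooks))

# Subset Indexing: use the code as a hash key and inspect only the matching bucket.
buckets = defaultdict(list)
for idx, code in enumerate(binary_codes):
    buckets[tuple(code)].append(idx)
candidates = buckets[tuple(binary_encode(query, hyperplanes))]
```

In this sketch, Distance Estimation scans all codes but replaces the exact distance with a cheap approximation, whereas Subset Indexing avoids the scan by looking up only the bucket that shares the query's code. The same bucket lookup could be keyed on `tuple(nary_encode(...))` to index with the n-ary code instead.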
Similar resources
A Fresh Look at Coding for q-ary Symmetric Channels
This paper studies coding schemes for the q-ary symmetric channel based on binary low-density parity-check (LDPC) codes that work for any alphabet size q = 2^m, m ∈ N, thus complementing some recently proposed packet-based schemes requiring large q. First, theoretical optimality of a simple layered scheme is shown, then a practical coding scheme based on a simple modification of standard binary L...
The fuzzy set model based on N-ary positively compensatory operators
We have enhanced the fuzzy set model by replacing MIN and MAX operators with binary positively compensatory operators. Though the binary operators provide higher retrieval effectiveness, they can give different document values for logically equivalent queries, e.g. t1 AND (t2 AND t3) and (t1 AND t2) AND t3. This is because they do not satisfy the basic boolean processing laws such as distributive...
Enumeration of sequences with large alphabets
A binary sequence of length n with w ones can be identified by its lexicographical rank in the set of all binary sequences with same number of ones and zeros, which is of size n! / (w!·(n−w)!). Although that enumeration has been deeply studied for binary case, it is less addressed for σ-ary sequences, where σ > 2. Assuming n is a fixed predetermined parameter, the enumerative coding of a given n-s...
Deep Hashing Network for Efficient Similarity Retrieval
Due to the storage and retrieval efficiency, hashing has been widely deployed to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing, which improves the quality of hash coding by exploiting the semantic similarity on data pairs, has received increasing attention recently. For most existing supervised hashing methods for image retrieval, an image is first...
Modified Gray-Level Coding Method for Absolute Phase Retrieval
Fringe projection systems have been widely applied in three-dimensional (3D) shape measurements. One of the important issues is how to retrieve the absolute phase. This paper presents a modified gray-level coding method for absolute phase retrieval. Specifically, two groups of fringe patterns are projected onto the measured objects, including three phase-shift patterns for the wrapped phase, an...
Journal title:
- CoRR
Volume abs/1509.06066
Publication date 2015